78 research outputs found
AspectMMKG: A Multi-modal Knowledge Graph with Aspect-aware Entities
Multi-modal knowledge graphs (MMKGs) combine different modal data (e.g., text
and image) for a comprehensive understanding of entities. Despite the recent
progress of large-scale MMKGs, existing MMKGs neglect the multi-aspect nature
of entities, limiting the ability to comprehend entities from various
perspectives. In this paper, we construct AspectMMKG, the first MMKG with
aspect-related images by matching images to different entity aspects.
Specifically, we collect aspect-related images from a knowledge base, and
further extract aspect-related sentences from the knowledge base as queries to
retrieve a large number of aspect-related images via an online image search
engine. Finally, AspectMMKG contains 2,380 entities, 18,139 entity aspects, and
645,383 aspect-related images. We demonstrate the usability of AspectMMKG in
the entity aspect linking (EAL) downstream task and show that previous EAL models
achieve new state-of-the-art performance with the help of AspectMMKG. To
facilitate the research on aspect-related MMKG, we further propose an
aspect-related image retrieval (AIR) model that aims to correct and expand
aspect-related images in AspectMMKG. We train an AIR model to learn the
relationship between entity image and entity aspect-related images by
incorporating entity image, aspect, and aspect image information. Experimental
results indicate that the AIR model could retrieve suitable images for a given
entity w.r.t. different aspects. Comment: Accepted by CIKM 202
Understanding Translationese in Cross-Lingual Summarization
Given a document in a source language, cross-lingual summarization (CLS) aims
at generating a concise summary in a different target language. Unlike
monolingual summarization (MS), naturally occurring source-language documents
paired with target-language summaries are rare. To collect large-scale CLS
data, existing datasets typically involve translation in their creation.
However, translated text differs from text originally written
in that language, a phenomenon known as translationese. In this paper, we first confirm that
different approaches to constructing CLS datasets lead to different
degrees of translationese. Then we systematically investigate how
translationese affects CLS model evaluation and performance when it appears in
source documents or target summaries. In detail, we find that (1) the
translationese in documents or summaries of test sets might lead to the
discrepancy between human judgment and automatic evaluation; (2) the
translationese in training sets would harm model performance in real-world
applications; (3) though machine-translated documents involve translationese,
they are very useful for building CLS systems on low-resource languages under
specific training strategies. Lastly, we give suggestions for future CLS
research including dataset and model developments. We hope that our work could
let researchers notice the phenomenon of translationese in CLS and take it into
account in the future. Comment: Accepted to the Findings of EMNLP 202
Zero-Shot Cross-Lingual Summarization via Large Language Models
Given a document in a source language, cross-lingual summarization (CLS) aims
to generate a summary in a different target language. Recently, the emergence
of Large Language Models (LLMs), such as GPT-3.5, ChatGPT and GPT-4, has
attracted wide attention from the computational linguistics community. However,
the performance of LLMs on CLS is not yet known. In this report, we
empirically use various prompts to guide LLMs to perform zero-shot CLS from
different paradigms (i.e., end-to-end and pipeline), and provide a preliminary
evaluation on the generated summaries. We find that ChatGPT and GPT-4
originally prefer to produce lengthy summaries with detailed information. These
two LLMs can further balance informativeness and conciseness with the help of
an interactive prompt, significantly improving their CLS performance.
Experimental results on three widely-used CLS datasets show that GPT-4 achieves
state-of-the-art zero-shot CLS performance, and performs competitively compared
with the fine-tuned mBART-50. Moreover, we also find some multi-lingual and
bilingual LLMs (i.e., BLOOMZ, ChatGLM-6B, Vicuna-13B and ChatYuan) have limited
zero-shot CLS ability. Due to the composite nature of CLS, which requires
models to perform summarization and translation simultaneously, accomplishing
this task in a zero-shot manner is a challenge even for LLMs. Therefore, we
sincerely hope and recommend that future LLM research use CLS as a testbed. Comment: Technical Report, 11 page
Snowman: A Million-scale Chinese Commonsense Knowledge Graph Distilled from Foundation Model
Constructing commonsense knowledge graphs (CKGs) has attracted wide research
attention due to its significant importance in cognitive intelligence.
Nevertheless, existing CKGs are typically oriented to English, limiting the
research in non-English languages. Meanwhile, the emergence of foundation
models like ChatGPT and GPT-4 has shown promising intelligence with the help of
reinforcement learning from human feedback. Against this background, in this
paper, we utilize foundation models to construct a Chinese CKG, named Snowman.
Specifically, we distill different types of commonsense head items from
ChatGPT, and continue to use it to collect tail items with respect to the head
items and pre-defined relations. Based on the preliminary analysis, we find the
negative commonsense knowledge distilled by ChatGPT achieves lower human
acceptance compared to other knowledge. Therefore, we design a simple yet
effective self-instruct filtering strategy to filter out invalid negative
commonsense. Overall, the constructed Snowman covers more than ten million
Chinese commonsense triples, making it the largest Chinese CKG. Moreover, human
studies show the acceptance of Snowman achieves 90.6%, indicating the
high-quality triples distilled by the cutting-edge foundation model. We also
conduct experiments on commonsense knowledge models to show the usability and
effectiveness of our Snowman. Comment: tech repor
Is ChatGPT a Good NLG Evaluator? A Preliminary Study
Recently, the emergence of ChatGPT has attracted wide attention from the
computational linguistics community. Many prior studies have shown that ChatGPT
achieves remarkable performance on various NLP tasks in terms of automatic
evaluation metrics. However, the ability of ChatGPT to serve as an evaluation
metric is still underexplored. Considering that assessing the quality of natural
language generation (NLG) models is an arduous task and NLG metrics notoriously
show their poor correlation with human judgments, we wonder whether ChatGPT is
a good NLG evaluation metric. In this report, we provide a preliminary
meta-evaluation on ChatGPT to show its reliability as an NLG metric. In detail,
we regard ChatGPT as a human evaluator and give task-specific (e.g.,
summarization) and aspect-specific (e.g., relevance) instruction to prompt
ChatGPT to evaluate the generated results of NLG models. We conduct experiments
on five NLG meta-evaluation datasets (including summarization, story generation
and data-to-text tasks). Experimental results show that compared with previous
automatic metrics, ChatGPT achieves state-of-the-art or competitive correlation
with human judgments in most cases. In addition, we find that the effectiveness
of the ChatGPT evaluator might be influenced by the creation method of the
meta-evaluation datasets. For meta-evaluation datasets whose creation depends
heavily on the references and which are thus biased, the ChatGPT evaluator
might lose its effectiveness. We hope our preliminary study could prompt the
emergence of a general-purpose, reliable NLG metric. Comment: Both first authors contributed equally. Technical Report, 11 pages.
Accepted to the 4th New Frontiers in Summarization Workshop (NewSumm@EMNLP
2023
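As a rough illustration of the meta-evaluation described in this abstract, the sketch below measures how well a metric's scores agree with human judgments using Spearman rank correlation. The abstract does not state which correlation coefficients were used, so the choice of Spearman is an assumption, and the function names and scores are hypothetical.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(metric_scores: Sequence[float], human_scores: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(metric_scores), average_ranks(human_scores))


# Hypothetical scores for five generated summaries (not data from the paper).
human = [4, 2, 5, 1, 3]
metric = [3.5, 2.0, 4.8, 1.1, 3.9]
rho = spearman(metric, human)
```

A value of `rho` near 1 means the metric orders the outputs almost exactly as the human annotators do.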
High-Resolution Boundary Detection for Medical Image Segmentation with Piece-Wise Two-Sample T-Test Augmented Loss
Deep learning methods have contributed substantially to the rapid advancement
of medical image segmentation, the quality of which relies on the suitable
design of loss functions. Popular loss functions, including the cross-entropy
and dice losses, often fall short in boundary detection, thereby limiting
high-resolution downstream applications such as automated diagnoses and
procedures. We developed a novel loss function that is tailored to reflect the
boundary information to enhance the boundary detection. As the contrast between
segmentation and background regions along the classification boundary naturally
induces heterogeneity over the pixels, we propose the piece-wise two-sample
t-test augmented (PTA) loss that is infused with the statistical test for such
heterogeneity. We demonstrate the improved boundary detection power of the PTA
loss compared to benchmark losses without a t-test component.
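The statistical ingredient named in this abstract, a two-sample t-test for heterogeneity between two groups of pixels, can be illustrated with a minimal sketch. This is not the PTA loss itself, whose definition the abstract does not give. It only computes a Welch two-sample t-statistic (the unequal-variance form is an assumption) for two hypothetical groups of predicted foreground probabilities sampled on either side of a boundary.

```python
from math import sqrt
from typing import Sequence


def sample_mean(xs: Sequence[float]) -> float:
    return sum(xs) / len(xs)


def sample_variance(xs: Sequence[float]) -> float:
    """Unbiased sample variance (divides by n - 1)."""
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def welch_t_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """Welch two-sample t-statistic; large |t| indicates the groups differ."""
    standard_error = sqrt(sample_variance(a) / len(a) + sample_variance(b) / len(b))
    return (sample_mean(a) - sample_mean(b)) / standard_error


# Hypothetical predicted foreground probabilities, not data from the paper.
inside_boundary = [0.9, 0.8, 1.0]
outside_boundary = [0.1, 0.2, 0.0]
t = welch_t_statistic(inside_boundary, outside_boundary)
```

A sharp boundary separates the two groups cleanly and gives a large `|t|`, while a blurred boundary pulls the statistic toward zero.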
Anti-icing property of bio-inspired micro-structure superhydrophobic surfaces and heat transfer model
Ice accumulation is a thorny problem that may cause serious damage, even disasters, in many areas, such as aircraft, power line maintenance, offshore oil platforms and locators of ships. Recent research has shed light on some promising bio-inspired anti-icing strategies to solve this problem. Inspired by typical plant surfaces with superhydrophobic character, such as lotus leaves and rose petals, structured superhydrophobic surfaces are prepared to study their anti-icing property. 7075 Al alloy, a material extensively used in aircraft and marine vessels, is employed as the substrate. The as-prepared surfaces are obtained by laser processing after being modified by stearic acid for 1 h at room temperature. The surface morphology, chemical composition and wettability are characterized by means of SEM, XPS, Fourier transform infrared (FTIR) spectroscopy and contact angle measurements. The morphologies of the structured as-prepared samples include round humps, square protuberances and mountain-range-like structures, and the as-prepared structured surfaces show excellent superhydrophobicity, with a water contact angle (WCA) as high as 166 ± 2°. Furthermore, the anti-icing property of the as-prepared surfaces was tested with a self-established apparatus, and the crystallization process of a cooling water droplet on the sample was recorded. More importantly, we introduce a model to analyze the heat transfer process between the droplet and the structured surfaces. This study offers insight into the heat transfer process of the superhydrophobic surface and supports further research into its unique resistance to ice accumulation.
Numerical Analysis on Performance of Different Ceiling Coverage of Fan Filter Units in Semiconductor Fabs
In practical semiconductor fab engineering, a low FFU coverage rate with a high air supply velocity is usually adopted to reduce investment cost, which places high resistance on the filter mounted at the FFU’s outlet. If a higher ceiling coverage and a lower air velocity are used, the resistance may be significantly reduced and remarkable energy savings can be obtained. In this study, CFD was used to simulate airflow in a semiconductor fab, and air circulation resistance was obtained by theoretical calculation. Four FFU ceiling coverage rates (25%, 50%, 75%, and 100%) were studied at the same air volume. The particle concentrations and payback period were analysed. The results show that (1) as the coverage rate increases, the particle concentration in most areas decreases significantly, with only a small increase in the local area around the occupant, and the particle concentration still meets the requirement; (2) when retrofitting to a higher coverage rate, the initial investment in FFUs increases slightly and the operating cost decreases significantly, and the payback period is only 1.1-2.3 years when a 25% coverage rate is changed to 37.5%-75%.
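The trade-off in this abstract rests on two simple relations, sketched below. At a fixed air volume, the face velocity through the FFUs falls in inverse proportion to the ceiling coverage, and a simple (undiscounted) payback period is the extra investment divided by the annual operating saving. The function names and all numbers are hypothetical, and the study's own resistance and cost models are not reproduced here.

```python
def face_velocity(air_volume_m3_s: float, ceiling_area_m2: float, coverage: float) -> float:
    """Mean air velocity (m/s) through FFUs covering a fraction of the ceiling."""
    if not 0.0 < coverage <= 1.0:
        raise ValueError("coverage must be in (0, 1]")
    return air_volume_m3_s / (ceiling_area_m2 * coverage)


def simple_payback_years(extra_investment: float, annual_saving: float) -> float:
    """Undiscounted payback period: extra first cost over yearly saving."""
    if annual_saving <= 0.0:
        raise ValueError("annual_saving must be positive")
    return extra_investment / annual_saving


# Hypothetical fab: 400 m^2 ceiling, 100 m^3/s total supply air.
v_low_coverage = face_velocity(100.0, 400.0, 0.25)   # 25% coverage
v_high_coverage = face_velocity(100.0, 400.0, 0.50)  # 50% coverage

# Hypothetical costs in arbitrary currency units.
payback = simple_payback_years(extra_investment=30000.0, annual_saving=20000.0)
```

Doubling the coverage from 25% to 50% at the same air volume halves the face velocity, which is the mechanism the abstract credits for the lower filter resistance.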
Health among the oldest-old in China: Which living arrangements make a difference?
This study aims to (1) examine the association of living arrangements and health among oldest-old Chinese, and (2) investigate gender differences in the association of living arrangements and health. Data were from the first two waves of the Chinese Longitudinal Healthy Longevity Survey, which included 9093 Chinese averaging 92 years old. Living arrangements had six mutually exclusive categories: living alone, with spouse, with children, with spouse and children, with others and in institutions. Using multinomial logistic regression, we found that baseline living arrangements are significantly associated with mortality, activities of daily living (ADL) disability, and self-rated health at Wave 2, controlling for baseline health, sociodemographic characteristics and availability of children. Further, the linkages between living arrangements and mortality vary by gender. Among the different living arrangements, having a spouse in the household (either with a spouse only or with both a spouse and children) provides the best health protection. Living alone and living with children are associated with both health advantages and disadvantages. Institutional living lowers mortality risk for men but not women. Living with others provides the least health benefits. Our study has extended the research on living arrangements and health to a unique population--the oldest-old in China--and clarified the health advantages and disadvantages of different living arrangements. Future research should examine the mechanisms linking living arrangements and health, and the experience of institutional living for men and women in China. Keywords: China; Oldest-old; Living arrangements; Mortality; Gender